Switch from OpenMPI 3.1.0 to 2.1.2 on 'waterman' (TRIL-213) #3363

Merged


bartlettroscoe
Member

CC: @fryeguy52, @mhoemmen, @kddevin, @rppawlo

Description

Switches the Power9 machine 'waterman' from the OpenMPI 3.1.0 env to the OpenMPI 2.1.2 env. @nmhamster told us today that OpenMPI 3.1 does not work on Power9 (apparently a known issue) and that we should use the OpenMPI 2.1.2 env instead.
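
As a rough illustration of the version switch described above, here is a minimal sketch of a guard that rejects the 3.1 series. The function name `openmpi_version_ok` is hypothetical and is not part of the ATDM scripts; it assumes the version string has already been obtained elsewhere (for Open MPI, the first line of `mpirun --version` has the form `mpirun (Open MPI) X.Y.Z`).

```shell
#!/bin/sh
# Hypothetical helper (not part of the Trilinos ATDM scripts): succeeds unless
# the given Open MPI version string is in the 3.1 series that this PR moves
# away from on Power9.
openmpi_version_ok() {
  case "$1" in
    3.1|3.1.*) return 1 ;;
    *)         return 0 ;;
  esac
}

# Example: check the two versions discussed in this PR.
for v in 2.1.2 3.1.0; do
  if openmpi_version_ok "$v"; then
    echo "OpenMPI $v: ok"
  else
    echo "OpenMPI $v: reported not to work on Power9, use the 2.1.2 env"
  fi
done
```

A check like this could run at the top of an env-loading script so that a stale 3.1.x module is caught before configure time; where to call it is left open here.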

Motivation and Context

This appears to fix a number of failing tests, including those reported in #3344 and #3331, and perhaps others.

How Has This Been Tested?

On 'white' I ran:

$ bsub -x -Is -n 20 \
    ./checkin-test-atdm.sh cuda-opt-Power9-Volta70 \
    --enable-packages=Kokkos,Teuchos,Zoltan2,Ifpack2,Tpetra,SEACAS,Panzer \
    --local-do-all

and it returned:

99% tests passed, 2 tests failed out of 645

Subproject Time Summary:
Ifpack2    = 646.49 sec*proc (36 tests)
Kokkos     = 475.50 sec*proc (27 tests)
Panzer     = 7880.59 sec*proc (158 tests)
SEACAS     =  24.01 sec*proc (20 tests)
Teuchos    = 207.66 sec*proc (129 tests)
Tpetra     = 2326.59 sec*proc (173 tests)
Zoltan2    = 1269.06 sec*proc (102 tests)

Total Test time (real) = 1888.47 sec

The following tests FAILED:
        619 - PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4 (Failed)
        623 - PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-3 (Timeout)

This is worth trying on the full ATDM Trilinos build.
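
The per-subproject counts in the summary above can be cross-checked against the "out of 645" total. The following is a minimal sketch; the function name `sum_subproject_tests` is hypothetical, and it assumes each summary line ends in `(N tests)` as in the output quoted above.

```shell
#!/bin/sh
# Hypothetical helper: sum the "(N tests)" counts from a ctest
# "Subproject Time Summary" read on stdin.
sum_subproject_tests() {
  # Split each line on parentheses; field 2 is then "N tests".
  awk -F'[()]' '/ tests\)/ { split($2, parts, " "); total += parts[1] } END { print total + 0 }'
}

# Feed in the summary lines quoted in this PR description.
total=$(sum_subproject_tests <<'EOF'
Ifpack2    = 646.49 sec*proc (36 tests)
Kokkos     = 475.50 sec*proc (27 tests)
Panzer     = 7880.59 sec*proc (158 tests)
SEACAS     =  24.01 sec*proc (20 tests)
Teuchos    = 207.66 sec*proc (129 tests)
Tpetra     = 2326.59 sec*proc (173 tests)
Zoltan2    = 1269.06 sec*proc (102 tests)
EOF
)
echo "subproject tests: $total"
```

The seven counts add up to 645, which agrees with the total that ctest reports.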

We were told today that OpenMPI 3.1.0 does not really work for this build on
'waterman' (and perhaps other Power8 and Power9 machines).  They must just
remove the env altogether.
@bartlettroscoe added the "client: ATDM" label (Any issue primarily impacting the ATDM project) on Aug 28, 2018
@bartlettroscoe added the "stage: in progress" label (Work on the issue has started) on Aug 28, 2018
@trilinos-autotester
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3

  • Build Num: 1438
  • Status: STARTED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| COMPILER_MODULE | sems-gcc/4.9.3 |
| JENKINS_BUILD_TYPE | Release |
| JENKINS_COMM_TYPE | MPI |
| JENKINS_DO_COMPLEX | OFF |
| JENKINS_JOB_TYPE | Experimental |
| MPI_MODULE | sems-openmpi/1.8.7 |
| PULLREQUESTNUM | 3363 |
| TEST_REPO_ALIAS | TRILINOS |
| TRILINOS_SOURCE_BRANCH | tril-213-openmpi-3-to-2 |
| TRILINOS_SOURCE_REPO | https://github.com/bartlettroscoe/Trilinos |
| TRILINOS_SOURCE_SHA | 18013e4 |
| TRILINOS_TARGET_BRANCH | develop |
| TRILINOS_TARGET_REPO | https://github.com/trilinos/Trilinos |
| TRILINOS_TARGET_SHA | 5bd0588 |

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 1132
  • Status: STARTED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| COMPILER_MODULE | sems-gcc/4.8.4 |
| JENKINS_BUILD_TYPE | Release |
| JENKINS_COMM_TYPE | MPI |
| JENKINS_DO_COMPLEX | OFF |
| JENKINS_JOB_TYPE | Experimental |
| MPI_MODULE | sems-openmpi/1.8.7 |
| PULLREQUESTNUM | 3363 |
| TEST_REPO_ALIAS | TRILINOS |
| TRILINOS_SOURCE_BRANCH | tril-213-openmpi-3-to-2 |
| TRILINOS_SOURCE_REPO | https://github.com/bartlettroscoe/Trilinos |
| TRILINOS_SOURCE_SHA | 18013e4 |
| TRILINOS_TARGET_BRANCH | develop |
| TRILINOS_TARGET_REPO | https://github.com/trilinos/Trilinos |
| TRILINOS_TARGET_SHA | 5bd0588 |

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 681
  • Status: STARTED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PULLREQUESTNUM | 3363 |
| TEST_REPO_ALIAS | TRILINOS |
| TRILINOS_SOURCE_BRANCH | tril-213-openmpi-3-to-2 |
| TRILINOS_SOURCE_REPO | https://github.com/bartlettroscoe/Trilinos |
| TRILINOS_SOURCE_SHA | 18013e4 |
| TRILINOS_TARGET_BRANCH | develop |
| TRILINOS_TARGET_REPO | https://github.com/trilinos/Trilinos |
| TRILINOS_TARGET_SHA | 5bd0588 |

Using Repos:

Repo: TRILINOS (bartlettroscoe/Trilinos)
  • Branch: tril-213-openmpi-3-to-2
  • SHA: 18013e4
  • Mode: TEST_REPO

Pull Request Author: bartlettroscoe

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3

  • Build Num: 1438
  • Status: PASSED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| COMPILER_MODULE | sems-gcc/4.9.3 |
| JENKINS_BUILD_TYPE | Release |
| JENKINS_COMM_TYPE | MPI |
| JENKINS_DO_COMPLEX | OFF |
| JENKINS_JOB_TYPE | Experimental |
| MPI_MODULE | sems-openmpi/1.8.7 |
| PULLREQUESTNUM | 3363 |
| TEST_REPO_ALIAS | TRILINOS |
| TRILINOS_SOURCE_BRANCH | tril-213-openmpi-3-to-2 |
| TRILINOS_SOURCE_REPO | https://github.com/bartlettroscoe/Trilinos |
| TRILINOS_SOURCE_SHA | 18013e4 |
| TRILINOS_TARGET_BRANCH | develop |
| TRILINOS_TARGET_REPO | https://github.com/trilinos/Trilinos |
| TRILINOS_TARGET_SHA | 5bd0588 |

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 1132
  • Status: PASSED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| COMPILER_MODULE | sems-gcc/4.8.4 |
| JENKINS_BUILD_TYPE | Release |
| JENKINS_COMM_TYPE | MPI |
| JENKINS_DO_COMPLEX | OFF |
| JENKINS_JOB_TYPE | Experimental |
| MPI_MODULE | sems-openmpi/1.8.7 |
| PULLREQUESTNUM | 3363 |
| TEST_REPO_ALIAS | TRILINOS |
| TRILINOS_SOURCE_BRANCH | tril-213-openmpi-3-to-2 |
| TRILINOS_SOURCE_REPO | https://github.com/bartlettroscoe/Trilinos |
| TRILINOS_SOURCE_SHA | 18013e4 |
| TRILINOS_TARGET_BRANCH | develop |
| TRILINOS_TARGET_REPO | https://github.com/trilinos/Trilinos |
| TRILINOS_TARGET_SHA | 5bd0588 |

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 681
  • Status: PASSED

Jenkins Parameters

| Parameter Name | Value |
|---|---|
| PULLREQUESTNUM | 3363 |
| TEST_REPO_ALIAS | TRILINOS |
| TRILINOS_SOURCE_BRANCH | tril-213-openmpi-3-to-2 |
| TRILINOS_SOURCE_REPO | https://github.com/bartlettroscoe/Trilinos |
| TRILINOS_SOURCE_SHA | 18013e4 |
| TRILINOS_TARGET_BRANCH | develop |
| TRILINOS_TARGET_REPO | https://github.com/trilinos/Trilinos |
| TRILINOS_TARGET_SHA | 5bd0588 |


CDash Test Results for PR# 3363.

@trilinos-autotester
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ rppawlo ]!

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@bartlettroscoe merged commit 5f4a5c6 into trilinos:develop on Aug 28, 2018
@bartlettroscoe removed the "stage: in progress" label (Work on the issue has started) on Aug 28, 2018
DrBooom pushed a commit that referenced this pull request Aug 29, 2018
tjfulle pushed a commit to tjfulle/Trilinos that referenced this pull request Dec 6, 2018
Labels
client: ATDM Any issue primarily impacting the ATDM project